    Exploiting natural selection to study adaptive behavior

    The research presented in this dissertation explores different computational and modeling techniques that, combined with predictions from evolution by natural selection, lead to the analysis of the adaptive behavior of populations under selective pressure. For this thesis, three computational methods were developed: EXPLoRA, EVORhA and SSA-ME. EXPLoRA finds genomic regions associated with a trait of interest (QTL) by explicitly modeling the expected linkage disequilibrium of a population of segregants under selection. Data from BSA experiments were analyzed to find genomic loci associated with ethanol tolerance. EVORhA explores the interplay between driving and hitchhiking mutations during evolution to reconstruct the subpopulation structure of clonal bacterial populations based on deep sequencing data. Data from mixed infections and evolution experiments of E. coli were used to reconstruct their population structure. SSA-ME uses mutual exclusivity in cancer to prioritize cancer driver genes. TCGA data of breast cancer tumor samples were analyzed.

    EXPLoRA-web: linkage analysis of quantitative trait loci using bulk segregant analysis

    Identification of genomic regions associated with a phenotype of interest is a fundamental step toward solving questions in biology and improving industrial research. Bulk segregant analysis (BSA) combined with high-throughput sequencing is a technique to efficiently identify these genomic regions associated with a trait of interest. However, distinguishing true from spuriously linked genomic regions and accurately delineating the genomic positions of these truly linked regions requires complex statistical models, which are currently implemented in software tools that are generally difficult for non-expert users to operate. To facilitate the exploration and analysis of data generated by bulk segregant analysis, we present EXPLoRA-web, a web service wrapped around our previously published algorithm EXPLoRA, which exploits linkage disequilibrium to increase the power and accuracy of quantitative trait loci identification in BSA analysis. EXPLoRA-web provides a user-friendly interface that enables easy data upload and parallel processing of different parameter configurations. Results are provided graphically and as a BED file and/or a text file, and the input is expected in widely used formats, enabling straightforward BSA data analysis. The web server is available at http://bioinformatics.intec.ugent.be/explora-web/

    SSA-ME Detection of cancer driver genes using mutual exclusivity by small subnetwork analysis

    Because of its clonal evolution, a tumor rarely contains multiple genomic alterations in the same pathway, as disrupting the pathway by one gene is often sufficient to confer the complete fitness advantage. As a result, many cancer driver genes display mutual exclusivity across tumors. However, searching for mutually exclusive gene sets requires analyzing all possible combinations of genes, leading to a problem that is typically too computationally complex to be solved without stringent a priori filtering that restricts the mutations included in the analysis. To overcome this problem, we present SSA-ME, a network-based method to detect cancer driver genes based on independently scoring small subnetworks for mutual exclusivity using a reinforced learning approach. Because of the algorithmic efficiency, no stringent upfront filtering is required. Analysis of TCGA cancer datasets illustrates the added value of SSA-ME: well-known recurrently mutated but also rarely mutated drivers are prioritized. We show that using mutual exclusivity to detect cancer driver genes is complementary to state-of-the-art approaches. This framework, in which a large number of small subnetworks are analyzed in order to solve a computationally complex problem (SSA), can be generically applied to any problem in which local neighborhoods in a network hold useful information.
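
The idea of scoring a small gene set for mutual exclusivity can be illustrated with a minimal sketch. This is not the SSA-ME scoring function, which the abstract does not specify; the function name `mutual_exclusivity`, the input layout, and the toy samples are assumptions made for illustration. Coverage is the fraction of samples with at least one mutated gene in the set, and exclusivity is the fraction of those covered samples that carry exactly one.

```python
def mutual_exclusivity(mutations, genes):
    """Return (coverage, exclusivity) of a gene set.

    mutations: dict mapping sample id -> set of mutated gene names
               (hypothetical input layout, assumed for this sketch).
    genes:     iterable of gene names forming the candidate set.
    """
    gene_set = set(genes)
    # Number of genes from the set that are mutated in each sample.
    hits = [len(mutated & gene_set) for mutated in mutations.values()]
    covered = sum(1 for h in hits if h >= 1)
    exclusive = sum(1 for h in hits if h == 1)
    coverage = covered / len(hits) if hits else 0.0
    exclusivity = exclusive / covered if covered else 0.0
    return coverage, exclusivity


# Toy data, invented for illustration only.
samples = {
    "s1": {"TP53"},
    "s2": {"PIK3CA"},
    "s3": {"TP53", "PIK3CA"},
    "s4": {"GATA3"},
    "s5": set(),
}
coverage, exclusivity = mutual_exclusivity(samples, ["TP53", "PIK3CA"])
print(coverage, exclusivity)
```

A gene set with high coverage and high exclusivity is the kind of pattern the abstract describes; restricting candidate sets to small subnetworks is what keeps the number of sets to score manageable.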

    Combined burden and functional impact tests for cancer driver discovery using DriverPower

    The discovery of driver mutations is one of the key motivations for cancer genome sequencing. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumour types, we describe DriverPower, a software package that uses mutational burden and functional impact evidence to identify driver mutations in coding and non-coding sites within cancer whole genomes. Using a total of 1373 genomic features derived from public sources, DriverPower's background mutation model explains up to 93% of the regional variance in the mutation rate across multiple tumour types. By incorporating functional impact scores, we are able to further increase the accuracy of driver discovery. Testing across a collection of 2583 cancer genomes from the PCAWG project, DriverPower identifies 217 coding and 95 non-coding driver candidates. Compared with six published methods used by the PCAWG Drivers and Functional Interpretation Working Group, DriverPower has the highest F1 score for both coding and non-coding driver discovery. This demonstrates that DriverPower is an effective framework for computational driver discovery.
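
A mutational burden test of the general kind mentioned above can be sketched as a one-sided binomial test: given a background mutation rate, how surprising is the observed mutation count in an element? This is a generic illustration, not DriverPower's actual model; the function name `burden_p_value` and the example numbers are assumptions, and the background rate is taken as given rather than predicted from genomic features.

```python
from math import comb


def burden_p_value(observed, n_sites, background_rate):
    """One-sided binomial p-value P(X >= observed), X ~ Binomial(n_sites, background_rate).

    Computed as 1 - P(X < observed); adequate for a sketch, but it loses
    precision for extremely small p-values.
    """
    below = sum(
        comb(n_sites, i)
        * background_rate**i
        * (1 - background_rate) ** (n_sites - i)
        for i in range(observed)
    )
    return max(0.0, 1.0 - below)


# Hypothetical element: 10 mutable positions, expected rate 0.1, 3 mutations seen.
p_value = burden_p_value(observed=3, n_sites=10, background_rate=0.1)
print(p_value)
```

The better the background rate explains regional variation in mutation rate, the fewer elements appear significant by chance, which is why the abstract emphasizes the variance explained by the background model.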

    Improved linkage analysis of Quantitative Trait Loci using bulk segregants unveils a novel determinant of high ethanol tolerance in yeast

    Background: Bulk segregant analysis (BSA) coupled to high-throughput sequencing is a powerful method to map genomic regions related to phenotypes of interest. It relies on crossing two parents, one inferior and one superior for a trait of interest. Segregants displaying the trait of the superior parent are pooled, and their DNA is extracted and sequenced. Genomic regions linked to the trait of interest are identified by searching the pool for overrepresented alleles that normally originate from the superior parent. BSA data analysis is non-trivial due to sequencing, alignment and screening errors. Results: To increase the power of the BSA technology and obtain a better distinction between spuriously and truly linked regions, we developed EXPLoRA (EXtraction of over-rePresented aLleles in BSA), an algorithm for BSA data analysis that explicitly models the dependency between neighboring marker sites by exploiting the properties of linkage disequilibrium through a Hidden Markov Model (HMM). Reanalyzing a BSA dataset for high ethanol tolerance in yeast allowed reliable identification of QTLs linked to this phenotype that could not be identified with statistical significance in the original study. Experimental validation of one of the least pronounced linked regions, by identifying its causative gene VPS70, confirmed the potential of our method. Conclusions: EXPLoRA performs at least as well as the state of the art and is robust even at low signal-to-noise ratios, i.e. when the true linkage signal is diluted by sampling or screening errors, or when few segregants are available.
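
The basic BSA signal described above, overrepresentation of the superior parent's allele in the pool, can be sketched as follows. This is not the EXPLoRA HMM: pooling read counts over neighboring markers is used here only as a crude stand-in for modeling the dependency between adjacent sites. The function name `superior_allele_frequency`, the `window` parameter, and the read counts are assumptions for illustration.

```python
def superior_allele_frequency(counts, window=1):
    """Pooled superior-parent allele frequency per marker.

    counts: list of (superior_reads, inferior_reads) tuples, ordered by
            genomic position (hypothetical input layout).
    window: number of neighboring markers pooled on each side.
    """
    freqs = []
    for i in range(len(counts)):
        lo = max(0, i - window)
        hi = min(len(counts), i + window + 1)
        superior = sum(s for s, _ in counts[lo:hi])
        total = sum(s + r for s, r in counts[lo:hi])
        # With no reads at all, fall back to 0.5 (no evidence either way);
        # this default is an assumption of the sketch.
        freqs.append(superior / total if total else 0.5)
    return freqs


# Toy read counts, invented for illustration only.
markers = [(10, 10), (12, 8), (18, 2), (19, 1), (11, 9)]
freqs = superior_allele_frequency(markers, window=1)
print(freqs)
```

Frequencies well above 0.5 over a run of adjacent markers suggest linkage to the trait; a single isolated high value is more likely to reflect sequencing or sampling noise, which is the distinction that an explicit model of neighboring sites is meant to sharpen.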

    Occurrence and identification of microplastics along a beach in the Biosphere Reserve of Lanzarote

    This work studied the accumulation of plastic debris on a remote beach located on La Graciosa island (Chinijo archipelago, Canary Islands). Microplastics were sampled in the 1–5 mm mesh opening range. An average plastic density of 36.3 g/m2 was obtained, with large variability along the 90 m of the beach (from 8.5 g/m2 to 103.4 g/m2). Microplastic particles preferentially accumulated in the part of the beach protected by rocks. A total of 9149 plastic particles were collected, recorded and measured, 87% of which corresponded to fragments. Clear colours and microscopic evidence of weathering corresponded to aged plastics wind-driven by the surface Canary Current. The chemical composition of plastic particles corresponded to PE (63%), PP (32%) and PS (3%). Higher PE/PP ratios were recorded in the more protected parts of the beach, suggesting preferential accumulation of more aged fragments.

    Author Correction: Retrospective evaluation of whole exome and genome mutation calls in 746 cancer samples (Nature Communications, (2020), 11, 1, (4748), 10.1038/s41467-020-18151-y)

    The original version of this Article omitted from the author list the 9th author Yize Li, who is from ‘The McDonnell Genome Institute at Washington University, St. Louis, MO 63108, USA and Department of Medicine, Division of Oncology, Washington University School of Medicine, St. Louis, MO 63108, USA’. This has been corrected in both the PDF and HTML versions of the Article.

    Pathway and network analysis of more than 2500 whole cancer genomes.

    The catalog of cancer driver mutations in protein-coding genes has greatly expanded in the past decade. However, non-coding cancer driver mutations are less well-characterized and only a handful of recurrent non-coding mutations, most notably TERT promoter mutations, have been reported. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumor types, we perform multi-faceted pathway and network analyses of non-coding mutations across 2583 whole cancer genomes from 27 tumor types compiled by the ICGC/TCGA PCAWG project. This work was motivated by the success of pathway and network analyses in prioritizing rare mutations in protein-coding genes. While few non-coding genomic elements are recurrently mutated in this cohort, we identify 93 genes harboring non-coding mutations that cluster into several modules of interacting proteins. Among these are promoter mutations associated with reduced mRNA expression in TP53, TLE4, and TCF4. We find that biological processes have variable proportions of coding and non-coding mutations: chromatin remodeling and proliferation pathways are altered primarily by coding mutations, while developmental pathways, including Wnt and Notch, are altered by both coding and non-coding mutations. RNA splicing is primarily altered by non-coding mutations in this cohort, and samples containing non-coding mutations in well-known RNA splicing factors exhibit gene expression signatures similar to those of samples with coding mutations in these genes. These analyses contribute a new repertoire of possible cancer genes and mechanisms that are altered by non-coding mutations and offer insights into additional cancer vulnerabilities that can be investigated for potential therapeutic treatments.